**EEL 6764 001: Graduate Computer Architecture**

**Spring 2025**

**Instructor: Dr. Srinivas Katkoori**

**Homework 1**

Assigned on Monday, 27th January.

DUE: 11:59:59PM, Monday, 10th February via Canvas

Upload your solutions in PDF format.

No late work will be accepted.

1. (20 pts) Amdahl’s Law – Solve Problem 1.15 (a, b, c, d) on pages 74 – 75.
2. (10 pts) Decide whether each of the following is true or false. Add a brief explanation (1-2 sentences) to get full credit.
   1. The performance of the system is limited by the fastest component even if some components are made 7X slower.
   2. You can pay attention to Amdahl’s law because it is still applicable.
   3. CPI rating of a processor is a good metric to measure its performance.
   4. The future of performance improvement will be mostly dependent on parallelization of programming rather than blindly adding multiple cores to a chip.
3. (10 pts) Fabrication Cost: Solve Problem 1.2 of Case Study 1 on page 68.

Assume Wafer Yield is 100%.

Hint: Review textbook examples on Pages 33 and 34.
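
As a rough aid for the fabrication cost problem, the standard die-cost model can be sketched as below. This assumes the model is the same one used in the textbook examples the hint points to. The function names and all numeric inputs are hypothetical and are not taken from the assigned problem; `n` stands for the process-complexity factor, and `wafer_yield` defaults to 1.0 to match the 100% assumption above.

```python
import math


def dies_per_wafer(wafer_diameter_cm, die_area_cm2):
    """Estimate dies per wafer: wafer area over die area, minus edge loss."""
    wafer_area = math.pi * (wafer_diameter_cm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_cm / math.sqrt(2 * die_area_cm2)
    return wafer_area / die_area_cm2 - edge_loss


def die_yield(defects_per_cm2, die_area_cm2, n, wafer_yield=1.0):
    """Fraction of dies that are good; n is the process-complexity factor."""
    return wafer_yield / (1 + defects_per_cm2 * die_area_cm2) ** n


def cost_per_good_die(wafer_cost, wafer_diameter_cm, die_area_cm2,
                      defects_per_cm2, n, wafer_yield=1.0):
    """Wafer cost spread over the expected number of good dies."""
    good_dies = (dies_per_wafer(wafer_diameter_cm, die_area_cm2)
                 * die_yield(defects_per_cm2, die_area_cm2, n, wafer_yield))
    return wafer_cost / good_dies


# Hypothetical inputs for illustration only.
print(dies_per_wafer(30, 2.25))
print(die_yield(0.03, 2.25, 12))
print(cost_per_good_die(5000, 30, 2.25, 0.03, 12))
```

The sketch leaves the dies-per-wafer estimate as a real number; whether to round it down to a whole die count is a choice to state in your solution.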

4. (20 pts) A cell phone performs very different tasks, including streaming music, streaming video, and reading email, and each places very different demands on the processor. Battery life and overheating are two common problems for cell phones, so reducing power and energy consumption is critical. In this problem, we consider what to do when the user is not using the phone to its full computing capacity. We will evaluate an unrealistic scenario in which the cell phone has no specialized processing units. Instead, it has a quad-core, general-purpose processing unit. Each core uses 1 W at full use. For email-related tasks, the quad-core is 8X as fast as necessary.

Hint: Review textbook example on Page 25.
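
The scaling relations behind the power and energy questions can be sketched as follows. This is a minimal sketch that assumes the usual first-order CMOS model, in which dynamic energy per task is proportional to capacitive load times voltage squared, and dynamic power is that quantity times switching frequency. It also assumes the capacitive load and the amount of work stay fixed. The function names and the 0.85 scale factor are illustrative assumptions, not values from the problem.

```python
def relative_dynamic_energy(voltage_scale):
    """Dynamic energy relative to baseline; scales with voltage squared."""
    return voltage_scale ** 2


def relative_dynamic_power(voltage_scale, frequency_scale):
    """Dynamic power relative to baseline; voltage squared times frequency."""
    return voltage_scale ** 2 * frequency_scale


# Illustrative case: voltage and frequency both scaled to 85% of baseline.
print(relative_dynamic_energy(0.85))
print(relative_dynamic_power(0.85, 0.85))
```

Under this model, turning the clock off for part of the time changes how long power is drawn but not the power drawn while running, which is why the questions ask about energy and power separately.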

   1. How much dynamic energy and power are required compared to running at full power? First, suppose that the quad-core operates for 1/2 of the time and is idle for the rest of the time. That is, the clock is disabled for 1/2 of the time, with no leakage occurring during that time. Compare total dynamic energy as well as dynamic power while the core is running.
   2. How much dynamic energy and power are required using frequency and voltage scaling? Assume frequency and voltage are both reduced to 1/4 the entire time.
   3. Now assume the voltage may not decrease below 70% of the original voltage. This voltage is referred to as the voltage floor, and any voltage lower than that will lose the state. Therefore, while the frequency can keep decreasing, the voltage cannot. What are the dynamic energy and power savings in this case?
   4. How much energy is used with a dark silicon approach? This involves creating specialized ASIC hardware for each major task and power gating those elements when not in use. Only one general-purpose core would be provided, and the rest of the chip would be filled with specialized units. For email, the one core would operate for 25% of the time and be turned completely off with power gating for the other 75% of the time. During the other 75% of the time, a specialized ASIC unit that requires 20% of the energy of a core would be running.
5. (10 pts) Effective CPI: Solve Problem A.3 in Appendix A on page A-48.
6. (10 pts) CISC Architecture: Pick any CISC architecture, examine its instruction set architecture (ISA), and answer the following questions. You can search online for the ISA summary of the processor you have chosen. **Cite the source(s) you have used for this problem.**
   1. What makes it a CISC Architecture?
   2. List all instructions that you think are “complex” in nature.
   3. List all addressing modes supported by the processor.
7. (10 pts) We begin with a computer that uses a single-cycle implementation. When the stages are split by functionality, the stages do not require exactly the same amount of time. The original machine had a clock cycle time of 13 ns. After the stages were split, the measured times were IF, 2 ns; ID, 3.5 ns; EX, 1 ns; MEM, 4 ns; and WB, 2.5 ns. The pipeline register delay is 0.1 ns.
   1. What is the clock cycle time of the 5-stage pipelined machine?
   2. If there is a stall every four instructions, what is the CPI of the new machine?
   3. What is the speedup of the pipelined machine over the single-cycle machine?
   4. If the pipelined machine had an infinite number of stages, what would its speedup be over the single-cycle machine?
8. (10 pts) What are the four types of architecture according to Flynn’s taxonomy? For each architecture, identify the kinds of parallelism it exploits.
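
For the pipeline timing question (Problem 7), the relations involved can be sketched as below. This is a minimal sketch under stated assumptions: the pipelined clock is set by the slowest stage plus the pipeline register delay, every stall costs exactly one cycle, and both machines execute the same instruction count. The function names and all stage times are hypothetical and differ from the values in the problem.

```python
def pipelined_cycle_time(stage_times_ns, register_delay_ns):
    """Clock period set by the slowest stage plus pipeline register delay."""
    return max(stage_times_ns) + register_delay_ns


def cpi_with_stalls(instructions_per_stall):
    """Ideal CPI of 1 plus one assumed stall cycle every N instructions."""
    return 1 + 1 / instructions_per_stall


def speedup(single_cycle_ns, pipelined_cycle_ns, pipelined_cpi):
    """Time per instruction on the single-cycle machine (CPI = 1)
    divided by time per instruction on the pipelined machine."""
    return single_cycle_ns / (pipelined_cycle_ns * pipelined_cpi)


# Hypothetical three-stage machine, for illustration only.
stages = [1.0, 2.0, 1.5]
cycle = pipelined_cycle_time(stages, 0.2)
cpi = cpi_with_stalls(5)
print(cycle, cpi, speedup(sum(stages), cycle, cpi))
```

As the number of stages grows, the per-stage logic time shrinks but the register delay does not, so the register delay bounds the achievable cycle time in this model.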